allocation policy
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- (4 more...)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts (0.04)
- Asia > India > Rajasthan (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Modeling & Simulation (0.67)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts (0.04)
- Asia > India > Rajasthan (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.68)
Online Allocation and Learning in the Presence of Strategic Agents
We study the problem of allocating $T$ sequentially arriving items among $n$ homogenous agents under the constraint that each agent must receive a prespecified fraction of all items, with the objective of maximizing the agents' total valuation of items allocated to them. The agents' valuations for the item in each round are assumed to be i.i.d.
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
Online Learning for Dynamic Vickrey-Clarke-Groves Mechanism in Unknown Environments
Leon, Vincent, Etesami, S. Rasoul
We consider the problem of online dynamic mechanism design for sequential auctions in unknown environments, where the underlying market and, thus, the bidders' values vary over time as interactions between the seller and the bidders progress. We model the sequential auctions as an infinite-horizon average-reward Markov decision process (MDP). In each round, the seller determines an allocation and sets a payment for each bidder, while each bidder receives a private reward and submits a sealed bid to the seller. The state, which represents the underlying market, evolves according to an unknown transition kernel and the seller's allocation policy without episodic resets. We first extend the Vickrey-Clarke-Groves (VCG) mechanism to sequential auctions, thereby obtaining a dynamic counterpart that preserves the desired properties: efficiency, truthfulness, and individual rationality. We then focus on the online setting and develop a reinforcement learning algorithm for the seller to learn the underlying MDP and implement a mechanism that closely resembles the dynamic VCG mechanism. We show that the learned mechanism approximately satisfies efficiency, truthfulness, and individual rationality and achieves guaranteed performance in terms of various notions of regret.
- North America > United States > New York > New York County > New York City (0.14)
- North America > United States > Illinois > Champaign County > Urbana (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (3 more...)
- Banking & Finance (0.67)
- Education > Educational Setting > Online (0.41)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)
Adversarial Robustness in One-Stage Learning-to-Defer
Montreuil, Yannis, Yu, Letian, Carlier, Axel, Ng, Lai Xing, Ooi, Wei Tsang
Learning-to-Defer (L2D) enables hybrid decision-making by routing inputs either to a predictor or to external experts. While promising, L2D is highly vulnerable to adversarial perturbations, which can not only flip predictions but also manipulate deferral decisions. Prior robustness analyses focus solely on two-stage settings, leaving open the end-to-end (one-stage) case where predictor and allocation are trained jointly. We introduce the first framework for adversarial robustness in one-stage L2D, covering both classification and regression. Our approach formalizes attacks, proposes cost-sensitive adversarial surrogate losses, and establishes theoretical guarantees including $\mathcal{H}$, $(\mathcal{R }, \mathcal{F})$, and Bayes consistency. Experiments on benchmark datasets confirm that our methods improve robustness against untargeted and targeted attacks while preserving clean performance.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > France > Occitanie > Haute-Garonne > Toulouse (0.04)
- Asia > Singapore > Central Region > Singapore (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Massachusetts (0.04)
- Asia > India > Rajasthan (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > Strength High (0.68)